AITopics | arxiv prepr

Collaborating Authors

arxiv prepr

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Improved Exploration in GFlownets via Enhanced Epistemic Neural Networks

Muhammad, Sajan, Lahlou, Salem

arXiv.org Artificial IntelligenceOct-23-2025

Efficiently identifying the right trajectories for training remains an open problem in GFlowNets. To address this, it is essential to prioritize exploration in regions of the state space where the reward distribution has not been sufficiently learned. This calls for uncertainty-driven exploration, in other words, the agent should be aware of what it does not know. This attribute can be measured by joint predictions, which are particularly important for combinatorial and sequential decision problems. In this research, we integrate epistemic neural networks (ENN) with the conventional architecture of GFlowNets to enable more efficient joint predictions and better uncertainty quantification, thereby improving exploration and the identification of optimal trajectories. Our proposed algorithm, ENN-GFN-Enhanced, is compared to the baseline method in GFlownets and evaluated in grid environments and structured sequence generation in various settings, demonstrating both its efficacy and efficiency.

artificial intelligence, gflownet, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2506.16313

Country: Asia > Middle East > UAE (0.14)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Bridging Brain with Foundation Models through Self-Supervised Learning

Altaheri, Hamdi, Karray, Fakhri, Islam, Md. Milon, Raju, S M Taslim Uddin, Karimi, Amir-Hossein

arXiv.org Artificial IntelligenceJun-23-2025

Foundation models (FMs), powered by self-supervised learning (SSL), have redefined the capabilities of artificial intelligence, demonstrating exceptional performance in domains like natural language processing and computer vision. These advances present a transformative opportunity for brain signal analysis. Unlike traditional supervised learning, which is limited by the scarcity of labeled neural data, SSL offers a promising solution by enabling models to learn meaningful representations from unlabeled data. This is particularly valuable in addressing the unique challenges of brain signals, including high noise levels, inter-subject variability, and low signal-to-noise ratios. This survey systematically reviews the emerging field of bridging brain signals with foundation models through the innovative application of SSL. It explores key SSL techniques, the development of brain-specific foundation models, their adaptation to downstream tasks, and the integration of brain signals with other modalities in multimodal SSL frameworks. The review also covers commonly used evaluation metrics and benchmark datasets that support comparative analysis. Finally, it highlights key challenges and outlines future research directions. This work aims to provide researchers with a structured understanding of this rapidly evolving field and a roadmap for developing generalizable brain foundation models powered by self-supervision.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2506.16009

Country:

Europe (0.45)
North America > Canada (0.27)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

VL-SAFE: Vision-Language Guided Safety-Aware Reinforcement Learning with World Models for Autonomous Driving

Qu, Yansong, Huang, Zilin, Sheng, Zihao, Chen, Jiancong, Chen, Sikai, Labi, Samuel

arXiv.org Artificial IntelligenceMay-23-2025

-- Reinforcement learning (RL) - based autonomous driving policy learning faces critical limitations such as low sample efficiency and poor generalization; its reliance on online interactions and trial - and - error learning is especially unacceptable in safety - critical scenarios. Existing methods including s afe RL often fail to capture the true semantic meaning of "safety" in complex driving contexts, leading to either overly conservative driving behavior or constraint violations . To address these challenges, we propose VL - SAFE, a world model - based safe RL framework with Vision - Language model ( VLM) - as - safety - guidance paradigm, designed for offline safe policy learning. Specifically, we construct offline datasets containing data collected by expert agents and labeled with safety scores derived from VLMs. A world model is trained to generate imagined rollouts together with safety estimations, allowing the agent to perform safe planning without interacting with the real environment. Based on these imagined trajectories and safety evaluations, actor - critic le arning is conducted under VLM - based safety guidance to optimize the driving policy more safely and efficiently. Extensive evaluations demonstrate that VL - SAFE achieves superior sample efficiency, generalization, safety, and overall performance compared to existing baselines. To the best of our knowledge, this is the first work that introduces a VLM - guided world model - based approach for safe autonomous driving. The demo video and code can be accessed at: https://ys - qu.github.io/vlsafe - website/ Index Terms -- vision - language models; world model; safe reinforcement learning; autonomous driving C urrent transportation management methods have produced significant improvements [3], [4] . Autonomous driving, with its advanced intelligence and automation [5], offers a promising solution . At its core, autonomous driving aims to perceive [6], understand [7], and interact [8] with complex dynamic systems in real time. Yansong Qu is with the Lyles School of Civil and Construction Engineering, Purdue University, West Lafayette, 47907, USA (e - mail: qu 120@purdu.edu

machine learning, reinforcement learning, world model, (16 more...)

arXiv.org Artificial Intelligence

2505.16377

Country:

Asia > China (1.00)
North America > United States > Wisconsin (0.15)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Robotics & Automation (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Add feedback

RAR: Setting Knowledge Tripwires for Retrieval Augmented Rejection

Buonocore, Tommaso Mario, Parimbelli, Enea

arXiv.org Artificial IntelligenceMay-21-2025

Content moderation for large language models (LLMs) is increasingly critical as LLMs are deployed in user-facing applications. Traditional moderation often relies on static classifiers or handcrafted prompt filters, which struggle to adapt quickly to new threats [1] . Recent analyses show that even retrieval-augmented generation (RAG) pipelines can inadvertently introduce safety risks, causing models to change their safety profile [2] . This paper presents retrieval-augmented rejection (RAR), a novel approach that repurposes the RAG architecture [3], typically used to enhance LLM knowledge, as a dynamic content moderation mechanism. By intentionally adding documents that mimic harmful content and questions (which we term "negative documents") to the vector database and flagging them accordingly, the system can leverage the retrieval mechanism to identify and reject malicious queries without requiring model retraining or architectural changes. The key contributions of this work include: i) a novel content moderation approach that requires no architectural changes to existing RAG systems; ii) a methodology for creating and maintaining "negative documents" for effective query filtering; iii) a flexible threshold-based rejection mechanism that can be dynamically adjusted; iv) preliminary evaluation against existing content moderation approaches. 1 1.1.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2505.13581

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.69)
Information Technology > Security & Privacy (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Do Large Language Models know who did what to whom?

Denning, Joseph M., Guo, Xiaohan Hannah, Snefjella, Bryor, Blank, Idan A.

arXiv.org Artificial IntelligenceApr-29-2025

Large Language Models (LLMs) are commonly criticized for not "understanding" language. However, many critiques focus on cognitive abilities that, in humans, are distinct from language processing. Here, we instead study a kind of understanding tightly linked to language: inferring "who did what to whom" (thematic roles) in a sentence. Does the central training objective of LLMs--word prediction--result in sentence representations that capture thematic roles? In two experiments, we characterized sentence representations in four LLMs. In contrast to human similarity judgments, in LLMs the overall representational similarity of sentence pairs reflected syntactic similarity but not whether their agent and patient assignments were identical vs. reversed. Furthermore, we found little evidence that thematic role information was available in any subset of hidden units. However, some attention heads robustly captured thematic roles, independently of syntax. Therefore, LLMs can extract thematic roles but, relative to humans, this information influences their representations more weakly.

large language model, machine learning, thematic role assignment, (20 more...)

arXiv.org Artificial Intelligence

2504.16884

Country: North America > United States (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Using LLMs as prompt modifier to avoid biases in AI image generators

Peinl, René

arXiv.org Artificial IntelligenceApr-16-2025

This study examines how Large Language Models (LLMs) can reduce biases in text-to-image generation systems by modifying user prompts. We define bias as a model's unfair deviation from population statistics given neutral prompts. Our experiments with Stable Diffusion XL, 3.5 and Flux demonstrate that LLM-modified prompts significantly increase image diversity and reduce bias without the need to change the image generators themselves. While occasionally producing results that diverge from original user intent for elaborate prompts, this approach generally provides more varied interpretations of underspecified requests rather than superficial variations. The method works particularly well for less advanced image generators, though limitations persist for certain contexts like disability representation. All prompts and generated images are available at https://iisys-hof.github.io/llm-prompt-img-gen/

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2504.11104

Country:

North America > United States (0.46)
Europe > Germany (0.29)

Genre: Research Report (0.70)

Industry:

Government > Military (0.47)
Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Towards 3D Semantic Scene Completion for Autonomous Driving: A Meta-Learning Framework Empowered by Deformable Large-Kernel Attention and Mamba Model

Qu, Yansong, Huang, Zilin, Sheng, Zihao, Chen, Tiantian, Chen, Sikai

arXiv.org Artificial IntelligenceNov-6-2024

Semantic scene completion (SSC) is essential for achieving comprehensive perception in autonomous driving systems. However, existing SSC methods often overlook the high deployment costs in real-world applications. Traditional architectures, such as 3D Convolutional Neural Networks (3D CNNs) and self-attention mechanisms, face challenges in efficiently capturing long-range dependencies within 3D voxel grids, limiting their effectiveness. To address these issues, we introduce MetaSSC, a novel meta-learning-based framework for SSC that leverages deformable convolution, large-kernel attention, and the Mamba (D-LKA-M) model. Our approach begins with a voxel-based semantic segmentation (SS) pretraining task, aimed at exploring the semantics and geometry of incomplete regions while acquiring transferable meta-knowledge. Using simulated cooperative perception datasets, we supervise the perception training of a single vehicle using aggregated sensor data from multiple nearby connected autonomous vehicles (CAVs), generating richer and more comprehensive labels. This meta-knowledge is then adapted to the target domain through a dual-phase training strategy that does not add extra model parameters, enabling efficient deployment. To further enhance the model's capability in capturing long-sequence relationships within 3D voxel grids, we integrate Mamba blocks with deformable convolution and large-kernel attention into the backbone network. Extensive experiments demonstrate that MetaSSC achieves state-of-the-art performance, significantly outperforming competing models while also reducing deployment costs.

dataset, scenario, scene completion, (15 more...)

arXiv.org Artificial Intelligence

2411.03672

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Robotics & Automation (0.63)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

Super-resolved virtual staining of label-free tissue using diffusion models

Zhang, Yijie, Huang, Luzhe, Pillar, Nir, Li, Yuzhu, Chen, Hanlong, Ozcan, Aydogan

arXiv.org Artificial IntelligenceOct-26-2024

Virtual staining of tissue offers a powerful tool for transforming label-free microscopy images of unstained tissue into equivalents of histochemically stained samples. This study presents a diffusion model-based super-resolution virtual staining approach utilizing a Brownian bridge process to enhance both the spatial resolution and fidelity of label-free virtual tissue staining, addressing the limitations of traditional deep learning-based methods. Our approach integrates novel sampling techniques into a diffusion model-based image inference process to significantly reduce the variance in the generated virtually stained images, resulting in more stable and accurate outputs. Blindly applied to lower-resolution auto-fluorescence images of label-free human lung tissue samples, the diffusion-based super-resolution virtual staining model consistently outperformed conventional approaches in resolution, structural similarity and perceptual accuracy, successfully achieving a super-resolution factor of 4-5x, increasing the output space-bandwidth product by 16-25-fold compared to the input label-free microscopy images. Diffusion-based super-resolved virtual tissue staining not only improves resolution and image quality but also enhances the reliability of virtual staining without traditional chemical staining, offering significant potential for clinical diagnostics.

artificial intelligence, diffusion model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.20073

Country: North America > United States > California > Los Angeles County > Los Angeles (0.29)

Genre:

Research Report > Experimental Study (0.69)
Research Report > New Finding (0.68)

Industry:

Health & Medicine > Diagnostic Medicine (0.70)
Health & Medicine > Therapeutic Area (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Safeguarding Large Language Models: A Survey

Dong, Yi, Mu, Ronghui, Zhang, Yanghao, Sun, Siqi, Zhang, Tianle, Wu, Changshun, Jin, Gaojie, Qi, Yi, Hu, Jinwei, Meng, Jie, Bensalem, Saddek, Huang, Xiaowei

arXiv.org Artificial IntelligenceJun-3-2024

In the burgeoning field of Large Language Models (LLMs), developing a robust safety mechanism, colloquially known as "safeguards" or "guardrails", has become imperative to ensure the ethical use of LLMs within prescribed boundaries. This article provides a systematic literature review on the current status of this critical mechanism. It discusses its major challenges and how it can be enhanced into a comprehensive mechanism dealing with ethical issues in various contexts. First, the paper elucidates the current landscape of safeguarding mechanisms that major LLM service providers and the open-source community employ. This is followed by the techniques to evaluate, analyze, and enhance some (un)desirable properties that a guardrail might want to enforce, such as hallucinations, fairness, privacy, and so on. Based on them, we review techniques to circumvent these controls (i.e., attacks), to defend the attacks, and to reinforce the guardrails. While the techniques mentioned above represent the current status and the active research trends, we also discuss several challenges that cannot be easily dealt with by the methods and present our vision on how to implement a comprehensive guardrail through the full consideration of multi-disciplinary approach, neural-symbolic method, and systems development lifecycle.

arxiv prepr, language model, llm, (11 more...)

arXiv.org Artificial Intelligence

2406.02622

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(11 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Workflow (0.93)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The Limits of Fair Medical Imaging AI In The Wild

Yang, Yuzhe, Zhang, Haoran, Gichoya, Judy W, Katabi, Dina, Ghassemi, Marzyeh

arXiv.org Artificial IntelligenceDec-11-2023

As artificial intelligence (AI) rapidly approaches human-level performance in medical imaging, it is crucial that it does not exacerbate or propagate healthcare disparities. Prior research has established AI's capacity to infer demographic data from chest X-rays, leading to a key concern: do models using demographic shortcuts have unfair predictions across subpopulations? In this study, we conduct a thorough investigation into the extent to which medical AI utilizes demographic encodings, focusing on potential fairness discrepancies within both in-distribution training sets and external test sets. Our analysis covers three key medical imaging disciplines: radiology, dermatology, and ophthalmology, and incorporates data from six global chest X-ray datasets. We confirm that medical imaging AI leverages demographic shortcuts in disease classification. While correcting shortcuts algorithmically effectively addresses fairness gaps to create "locally optimal" models within the original data distribution, this optimality is not true in new test settings. Surprisingly, we find that models with less encoding of demographic attributes are often most "globally optimal", exhibiting better fairness during model evaluation in new test environments. Our work establishes best practices for medical imaging models which maintain their performance and fairness in deployments beyond their initial training contexts, underscoring critical considerations for AI clinical deployments across populations and sites.

dataset, fairness, fairness gap, (16 more...)

arXiv.org Artificial Intelligence

2312.10083

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(3 more...)

Add feedback